m-Bonsai: a Practical Compact Dynamic Trie

Authors

  • Andreas Poyias
  • Simon J. Puglisi
  • Rajeev Raman
Abstract

We consider the problem of implementing a space-efficient dynamic trie, with an emphasis on good practical performance. For a trie with n nodes over an alphabet of size σ, the information-theoretic lower bound is n log σ + O(n) bits. The Bonsai data structure is a compact trie proposed by Darragh et al. (Softw., Pract. Exper. 23(3), 1993, pp. 277–291). Its disadvantages include the user having to specify an upper bound M on the trie size in advance (which cannot be changed easily after initialization), a space usage of M log σ + O(M log log M) bits (which is asymptotically non-optimal for smaller σ or if n ≪ M), and a lack of support for deletions. It supports traversal and update operations in O(1/ε) expected time (based on assumptions about the behaviour of hash functions), where ε = (M − n)/M, and has excellent speed performance in practice. We propose an alternative, m-Bonsai, that addresses the above problems, obtaining a trie that uses (1 + β)n(log σ + O(1)) bits in expectation, and supports traversal and update operations in O(1/β) expected time and O(1/β) amortized expected time respectively, for any user-specified parameter β > 0 (again based on assumptions about the behaviour of hash functions). We give an implementation of m-Bonsai which uses considerably less memory and is slightly faster than the original Bonsai.
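To make the hash-table view of a trie concrete, here is a minimal Python sketch. It is not the authors' implementation: the class `HashTrie`, its method names, the hash function, and the choice of slot 0 for the root are all assumptions made for illustration. It assumes a node is identified by the slot it occupies in an open-addressing table of fixed size M, and that a child is located by hashing the pair (parent slot, symbol). It stores full keys and omits the compact encoding that gives Bonsai and m-Bonsai their space bounds, so it only illustrates the addressing idea and the quantity ε = (M − n)/M.

```python
class HashTrie:
    """Toy trie kept in one open-addressing hash table (linear probing).

    Illustrative only: full keys are stored per slot, so this shows the
    addressing idea but not the compact space usage discussed above.
    """

    def __init__(self, capacity, sigma):
        self.capacity = capacity          # M: number of slots, fixed up front
        self.sigma = sigma                # alphabet size; symbols are 0..sigma-1
        self.slots = [None] * capacity    # slot -> (parent_id, symbol) or None
        self.root = 0                     # assumption: the root lives in slot 0
        self.slots[self.root] = (-1, -1)  # sentinel key that no child can equal
        self.n = 1                        # number of nodes, root included

    def _probe(self, parent, symbol):
        """Return the slot holding (parent, symbol), or the free slot for it."""
        key = (parent, symbol)
        # Hypothetical hash function chosen for this sketch.
        i = ((parent * self.sigma + symbol) * 2654435761) % self.capacity
        for _ in range(self.capacity):
            if self.slots[i] is None or self.slots[i] == key:
                return i
            i = (i + 1) % self.capacity   # linear probing
        raise MemoryError("hash table is full")

    def get_child(self, parent, symbol):
        """Return the child's node id (its slot), or None if it does not exist."""
        i = self._probe(parent, symbol)
        return i if self.slots[i] is not None else None

    def add_child(self, parent, symbol):
        """Create the child if needed and return its node id."""
        i = self._probe(parent, symbol)
        if self.slots[i] is None:
            self.slots[i] = (parent, symbol)
            self.n += 1
        return i

    def insert(self, symbols):
        """Insert a sequence of symbols; return the id of the final node."""
        node = self.root
        for s in symbols:
            node = self.add_child(node, s)
        return node

    def find(self, symbols):
        """Follow a sequence of symbols; return the final node id or None."""
        node = self.root
        for s in symbols:
            node = self.get_child(node, s)
            if node is None:
                return None
        return node

    def free_fraction(self):
        """The quantity (M - n) / M: the fraction of unused slots."""
        return (self.capacity - self.n) / self.capacity


trie = HashTrie(capacity=16, sigma=4)
a = trie.insert([0, 1, 2])
b = trie.insert([0, 1, 3])   # shares the prefix [0, 1] with the first insert
```

In this sketch the table size is fixed when the trie is created, which mirrors the limitation of the original Bonsai noted in the abstract; as the table fills up, the free fraction shrinks and linear probing takes longer.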

Similar resources

Improved Practical Compact Dynamic Tries

We consider the problem of implementing a dynamic trie with an emphasis on good practical performance. For a trie with n nodes with an alphabet of size σ, the information-theoretic lower bound is n log σ + O(n) bits. The Bonsai data structure [1] supports trie operations in O(1) expected time (based on assumptions about the behaviour of hash functions). While its practical speed performance is ...

Bonsai: a Compact Representation of Trees

This paper shows how trees can be stored in a very compact form, called ‘Bonsai’, using hash tables. A method is described that is suitable for large trees that grow monotonically within a predefined maximum size limit. Using it, pointers in any tree can be represented within 6 + log₂ n bits per node, where n is the maximum number of children a node can have. We first describe a general way of ...

Practical Evaluation of Lempel-Ziv-78 and Lempel-Ziv-Welch Tries

We present the first thorough practical study of the Lempel-Ziv-78 and the Lempel-Ziv-Welch computation based on trie data structures. With a careful selection of trie representations we can beat well-tuned popular trie data structures like Judy, m-Bonsai or Cedar.

Compact Suffix Trees Resemble PATRICIA Tries: Limiting Distribution of the Depth

Suffix trees are the most frequently used data structures in algorithms on words. In this paper, we consider the depth of a compact suffix tree, also known as the PAT tree, under some simple probabilistic assumptions. For a biased memoryless source, we prove that the limiting distribution for the depth in a PAT tree is the same as the limiting distribution for the depth in a PATRICIA trie, even...

Faster Dynamic Compact Tries with Applications to Sparse Suffix Tree Construction and Other String Problems

The dynamic compact trie is a fundamental data structure for a wide range of string processing problems. Jansson, Sadakane, and Sung (LNCS 4855, pp. 424–435, FSTTCS 2007) presented the dynamic uncompacted trie data structure of n nodes in O(n log σ) space supporting pattern matching in O((|P|/α)f(n)) time and insert/delete operations in O(f(n)) time, where f(n) = ((log log n)/log log log n) is th...

Journal:
  • CoRR

Volume: abs/1704.05682

Publication date: 2017